Abstract



We consider the general polynomial optimization problem
$P:\ f^* = \min\{f(x) : x \in K\}$, where $K$ is a compact basic
semi-algebraic set. We first show that the standard Lagrangian
relaxation yields a lower bound as close as desired to the global
optimum $f^*$, provided that it is applied to a problem $\tilde{P}$
equivalent to $P$, in which sufficiently many redundant constraints
(products of the initial ones) are added to the initial description of
$P$. Next we show that the standard hierarchy of LP-relaxations of $P$
(in the spirit of Sherali-Adams' RLT) can be interpreted as a brute
force simplification of the above Lagrangian relaxation, in which a
nonnegative polynomial (with coefficients to be determined) is replaced
with a constant polynomial equal to zero. Inspired by this
interpretation, we provide a systematic improvement of the LP-hierarchy
by doing a much less brutal simplification, which results in a
parametrized hierarchy of semidefinite programs (and not linear
programs any more). For each semidefinite program in the parametrized
hierarchy, the semidefinite constraint has a fixed size $O(n^k)$,
independently of the rank in the hierarchy, in contrast with the
standard hierarchy of semidefinite relaxations. The parameter $k$ is to
be decided by the user. When applied to a non trivial class of convex
problems, the first relaxation of the parametrized hierarchy is exact,
in contrast with the LP-hierarchy where convergence cannot be finite.
When applied to 0/1 programs it is at least as good as the first one in
the hierarchy of semidefinite relaxations. However, obstructions to
exactness still exist and are briefly analyzed. Finally, the standard
semidefinite hierarchy can also be viewed as a simplification of an
extended Lagrangian relaxation, but different in spirit, as sums of
squares (and not scalar) multipliers are allowed.

1 Introduction

Recent years have seen the development of (global) semi-algebraic
optimization, and in particular of LP- and semidefinite relaxations for
the polynomial optimization problem:

\[ P:\quad f^* = \min_x \{ f(x) : x \in K \}, \tag{1.1} \]

where $f \in \mathbb{R}[x]$ is a polynomial and $K \subset \mathbb{R}^n$
is the basic semi-algebraic set

\[ K = \{ x \in \mathbb{R}^n : g_j(x) \ge 0,\ j = 1,\dots,m \}, \tag{1.2} \]

for some polynomials $g_j \in \mathbb{R}[x]$, $j = 1,\dots,m$.

In particular, associated with P are two hierarchies of convex relaxations: 

- Semidefinite relaxations based on Putinar's certificate of positivity
on K [16], where the d-th convex relaxation of the hierarchy is a
semidefinite program which solves the optimization problem

\[ \gamma_d = \max_{t,\sigma_j} \Big\{ t : f - t = \sigma_0 + \sum_{j=1}^m \sigma_j g_j \Big\}. \tag{1.3} \]

The unknowns $\sigma_j$ are sums of squares polynomials with the degree
bound constraint $\deg(\sigma_j g_j) \le 2d$, $j = 0,\dots,m$ (with
$g_0 = 1$), and the expression in (1.3) is a certificate of positivity
on K for the polynomial $x \mapsto f(x) - t$.

- LP-relaxations based on Krivine-Stengle's certificate of positivity
on K [9, 19], where the d-th convex relaxation of the hierarchy is a
linear program which solves the optimization problem

\[ \theta_d = \max_{\lambda \ge 0,\, t} \Big\{ t : f - t = \sum_{(\alpha,\beta) \in \mathbb{N}^{2m}_d} \lambda_{\alpha\beta} \Big( \prod_{j=1}^m g_j^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j)^{\beta_j} \Big) \Big\}, \tag{1.4} \]

where $\mathbb{N}^{2m}_d = \{ (\alpha,\beta) \in \mathbb{N}^{2m} : \sum_j (\alpha_j + \beta_j) \le d \}$.
The unknowns are $t$ and the nonnegative scalars
$\lambda = (\lambda_{\alpha\beta})$, and it is assumed that
$0 \le g_j \le 1$ on K (possibly after scaling) and that the family
$\{g_j, 1 - g_j\}$ generates the algebra $\mathbb{R}[x]$ of polynomials.
Problem (1.4) is an LP because stating that the two polynomials on both
sides of "=" are equal yields linear constraints on the
$\lambda_{\alpha\beta}$'s. For instance, the LP-hierarchy from
Sherali-Adams' RLT [17] and its variants [18] are of this form. See
more details in §3.3.
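To make the "linear constraints" in (1.4) concrete, the following is a minimal sketch for a toy univariate case, assuming a single generator g(x) = x on [0, 1]. The helper names (`poly_mul`, `poly_pow`, `lp_columns`) are hypothetical and not from the paper. Each product g^a (1-g)^b is expanded in the monomial basis; its coefficient vector is the column of the LP constraint matrix multiplying the unknown lambda_{ab}.

```python
from itertools import product


def poly_mul(p, q):
    """Multiply two univariate polynomials (coefficient lists, low degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, e):
    """Raise a polynomial to a nonnegative integer power."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out


def lp_columns(g, d):
    """Monomial coefficients of g^a (1-g)^b for all a + b <= d.

    Returns a dict {(a, b): coefficient list}, padded to a common length.
    Matching coefficients of f - t against a nonnegative combination of
    these columns gives the equality constraints of the LP.
    """
    one_minus_g = [1 - g[0]] + [-c for c in g[1:]]
    cols = {}
    for a, b in product(range(d + 1), repeat=2):
        if a + b <= d:
            cols[(a, b)] = poly_mul(poly_pow(g, a), poly_pow(one_minus_g, b))
    width = max(len(c) for c in cols.values())
    return {ab: c + [0] * (width - len(c)) for ab, c in cols.items()}


cols = lp_columns([0, 1], 2)  # toy assumption: g(x) = x, level d = 2
```

Here `cols[(1, 1)]` holds the coefficients of x(1 - x), and there is one column per pair (a, b) with a + b <= 2, so the LP at level d = 2 has six multipliers in this toy setting.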

In both cases, $(\gamma_d)$ and $(\theta_d)$, $d \in \mathbb{N}$,
provide two monotone nondecreasing sequences of lower bounds on $f^*$,
and if K is compact then both converge to $f^*$ as $d$ increases. For
more details as well as a comparison of such relaxations, the
interested reader is referred to e.g. Lasserre [12, 10] and Laurent
[13], as well as Chlamtac and Tulsiani [5] for the impact of LP- and
SDP-hierarchies on approximation algorithms in combinatorial
optimization.

Of course, in principle, one would much prefer to solve LP-relaxations
rather than semidefinite relaxations (i.e. compute $\theta_d$ rather
than $\gamma_d$) because present LP-software packages can solve
problems with millions of variables and constraints, which is far from
being the case for semidefinite solvers. And so the hierarchy (1.3)
applies to problems of modest size only, unless some sparsity or
symmetry is taken into account, in which case specialized variants can
handle problems of much larger size. On the other hand, the
LP-relaxations (1.4) suffer from several serious theoretical and
practical drawbacks. For instance, it has been shown in [10, 12] that
the LP-relaxations cannot be exact for most convex problems, i.e., the
sequence of the associated optimal values converges to the global
optimum only asymptotically and not in finitely many steps. Moreover,
the LPs of the hierarchy are numerically ill-conditioned. This is in
contrast with the semidefinite relaxations (1.3), for which finite
convergence takes place for convex problems where $\nabla^2 f(x^*)$ is
positive definite at every minimizer $x^* \in K$ (see de Klerk and
Laurent [6, Corollary 3.3]) and occurs at the first relaxation for
SOS-convex^1 problems [11, Theorem 3.3]. In fact, as demonstrated in
recent works of Marshall [14] and Nie [15], finite convergence is
generic (even for non convex problems).

So would it be possible to define a hierarchy of convex relaxations in
between (1.3) and (1.4), i.e., with some of the nice features of the
semidefinite relaxations but with a much less demanding computational
effort (hence closer to the LP-relaxations)? This paper is a
contribution in this direction.

^1 An SOS-convex polynomial is a convex polynomial whose Hessian
factors as $L(x)L(x)^T$ for some rectangular matrix polynomial $L$. For
instance, separable convex polynomials are SOS-convex.

Contribution. This paper consists of two contributions. In the first
contribution, which is of a theoretical nature, we describe a new
hierarchy of convex relaxations for P with the following feature. Each
relaxation in the hierarchy is a finite-dimensional convex optimization
problem of the form:



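\[ \rho_d = \max_{\lambda} \{ G_d(\lambda) : \lambda \ge 0 \}, \tag{1.5} \]

where $G_d(\cdot)$ is the concave function defined by:

\[ \lambda \mapsto G_d(\lambda) := \min_x \Big\{ f(x) - \sum_{(\alpha,\beta) \in \mathbb{N}^{2m}_d} \lambda_{\alpha\beta} \Big( \prod_{j=1}^m g_j(x)^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j(x))^{\beta_j} \Big) \Big\}. \tag{1.6} \]

Therefore $\rho_d \le f^*$ for all $d$. And we prove that: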

(a) $\rho_d \ge \theta_d$ for all $d$, and so $\rho_d \to f^*$ as $d$
increases.

(b) For convex problems P, i.e., when $f$, $-g_j$ are convex,
$j = 1,\dots,m$, and Slater's condition holds, the convergence is
finite and occurs at the first relaxation, i.e., $\rho_1 = f^*$, in
contrast with the LP-relaxations (1.4), where convergence cannot be
finite (and is very slow on simple trivial examples). In fact,
computing $\rho_1$ is just applying the standard dual method of
multipliers (or Lagrangian relaxation) to the convex problem P.

(c) For 0/1 optimization, i.e., when $K \subset \{0,1\}^n$, finite
convergence takes place and the optimal value $\rho_d$ provides a
better lower bound than the one obtained with Sherali-Adams' RLT
hierarchy [17]. In fact, the latter solves (1.4) with only a subset of
the products that appear in (1.4).

(d) Finally, (1.5) has a nice interpretation in terms of the dual
method of Non Linear Programming (or Lagrangian relaxation). To see
this, consider the optimization problem $P_d$ defined by:

\[ P_d:\quad \min_x \Big\{ f(x) : \prod_{j=1}^m g_j(x)^{\alpha_j} \prod_{j=1}^m (1 - g_j(x))^{\beta_j} \ge 0,\ (\alpha,\beta) \in \mathbb{N}^{2m}_d \Big\}, \tag{1.7} \]

which has the same value $f^*$ as P because $P_d$ is just P with
additional redundant constraints; and notice that $P_1 = P$. Then
solving (1.5) is just applying the dual method of multipliers in Non
Linear Programming to $P_d$; see e.g. [H Chapter 8]. (In general one
obtains only a lower bound on the optimal value of $P_d$ when P is not
a convex program.) And so our result states that the Lagrangian
relaxation applied to $P_d$ provides a lower bound as close to $f^*$ as
desired, provided that $d$ is sufficiently large, i.e., provided that
sufficiently many redundant constraints are added to the description of
P.

Note in passing that this provides a rigorous rationale for the
well-known fact that adding redundant constraints helps for solving P.
Indeed, even though the new problems $P_d$, $d \in \mathbb{N}$, are all
equivalent to P, their Lagrangian relaxations are not equivalent to
that of P.

Practical and computational considerations 

Our second contribution has a practical and algorithmic flavor. Even
though (1.5) is a convex optimization problem, evaluating
$G_d(\lambda)$ at a point $\lambda \ge 0$ requires computing the
unconstrained global minimum of the function

\[ x \mapsto L_d(x,\lambda) := f(x) - \sum_{(\alpha,\beta) \in \mathbb{N}^{2m}_d} \lambda_{\alpha\beta} \Big( \prod_{j=1}^m g_j(x)^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j(x))^{\beta_j} \Big), \tag{1.8} \]

an NP-hard problem in general. After all, in principle the goal of
Lagrangian relaxation is to end up with a problem which is easier to
solve than P, and so, in this respect, the hierarchy (1.5) is not
practical.

So in this second part of the paper, we first show that the
LP-relaxations (1.4) can be interpreted as a way to "restrict" and
simplify the hierarchy (1.5) by a simple and brute force trick, so as
to make it tractable (but of course less efficient). Namely, a certain
nonnegative polynomial (whose coefficients have to be determined) is
imposed to be the constant polynomial equal to zero! More precisely,
the nonnegative vector $\lambda$ in (1.5) is restricted to a polytope
so as to make the polynomial in (1.8) constant! In fact, if one had
initially defined the LP-relaxations (1.4) as this brute force (and
even brutal) simplification of (1.5), it would have been hard to
justify.

Inspired by this interpretation, we propose a systematic way to define
improved versions of the LP-hierarchy (1.4) by simplifying (1.5) in a
much less brutal manner. We now impose the same nonnegative polynomial
$L_d(\cdot,\lambda) - t$ to be an SOS polynomial of fixed degree $2k$
(rather than the zero polynomial as in (1.4)). The increase of
complexity is completely controlled by the parameter
$k \in \mathbb{N}$, which is chosen by the user. That is, in the new
resulting hierarchy (parametrized by $k$), each LP of the hierarchy
(1.4) now becomes a semidefinite program, but one whose semidefiniteness
constraint has a size that is fixed and equal to $\binom{n+k}{n}$,
independently of the rank $d$ in the hierarchy. (It is known that what
is crucial for solving semidefinite programs is the size of the LMIs
involved rather than the number of variables.) The level $k = 0$ of
complexity corresponds to the original LP-relaxations (1.4), the level
$k = 1$ corresponds to a hierarchy of semidefinite programs with a
Linear Matrix Inequality (LMI) of size $(n + 1)$, etc. To fix ideas,
let us mention that for $k = 1$, the first relaxation (i.e., $d = 1$)
is even stronger than the first relaxation of the hierarchy (1.3), as
it takes into account products of linear constraints; and so for
instance, when applied to the celebrated MAXCUT problem, the first
relaxation has the Goemans-Williamson performance guarantee. Moreover,
when $k = 1$ one obtains the so-called "Sherali-Adams + SDP" hierarchy
already used for approximating some 0/1 optimization problems. So an
important issue is: What do we gain by this increase of complexity?

Of course, from a computational complexity point of view, one way to
evaluate the efficiency of those relaxations is to analyze whether they
help reduce integrality gaps, e.g. for some 0/1 optimization problems.
For the level $k = 1$ (i.e. the "Sherali-Adams + SDP" hierarchy) some
negative results in this direction have been provided in Benabbas and
Magen [2], and in Benabbas et al. [3].

But from a different point of view, we claim that a highly desirable
property for a general purpose method (e.g., the hierarchies (1.3) or
(1.4)) aiming at solving NP-hard optimization problems is to behave
"efficiently" when applied to a class of problems considered relatively
"easy" to solve. Otherwise one might raise reasonable doubts on its
efficiency for more difficult problems, not only in a worst-case sense
but also "on average". Convex problems P as in (1.1)-(1.2), i.e., when
$f$, $-g_j$ are convex, form the most natural class of problems which
are considered easy to solve by some standard methods of Non Linear
Programming; see e.g. Ben-Tal and Nemirovski [1]. We have already
proved that the hierarchy (1.3) somehow recognizes convexity. For
instance, finite convergence takes place as soon as $\nabla^2 f(x^*)$
is positive definite at every global minimizer $x^* \in K$ (see de
Klerk and Laurent [6]); moreover, SOS-convex programs are solved at the
first step of the hierarchy, as shown in Lasserre [11]. On the other
hand, the LP-hierarchy (1.4) behaves poorly on such problems, as the
convergence cannot be finite; see e.g. Lasserre [12, 10].

We prove that the gain by this (controlled) increase of complexity is
precisely to permit finite convergence (and at the first step of the
hierarchy) for a non trivial class of convex problems. For instance,
with $k = 1$ the resulting hierarchy of semidefinite programs solves
convex quadratic programs exactly at the first step of the hierarchy.
And more generally, for $k \ge 1$, the first relaxation is exact for
SOS-convex^2 problems of degree at most $k$. On the other hand, we show
that for non convex problems, exactness at some relaxation in the
hierarchy still implies restrictive conditions.

2 Main result 

2.1 Notation and definitions 

Let $\mathbb{R}[x]$ be the ring of polynomials in the variables
$x = (x_1,\dots,x_n)$. Denote by
$\mathbb{R}[x]_d \subset \mathbb{R}[x]$ the vector space of polynomials
of degree at most $d$, which has dimension $s(d) = \binom{n+d}{n}$,
with e.g. the usual canonical basis $(x^\alpha)$ of monomials. Also,
denote by $\Sigma[x] \subset \mathbb{R}[x]$ (resp.
$\Sigma[x]_d \subset \mathbb{R}[x]_{2d}$) the space of sums of squares
(s.o.s.) polynomials (resp. s.o.s. polynomials of degree at most $2d$).
If $f \in \mathbb{R}[x]_d$, write
$f(x) = \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha x^\alpha$ in the
canonical basis and denote by
$\mathbf{f} = (f_\alpha) \in \mathbb{R}^{s(d)}$ its vector of
coefficients. Finally, let $\mathcal{S}^n$ denote the space of
$n \times n$ real symmetric matrices, with inner product
$\langle A, B \rangle = \mathrm{trace}\, AB$, and where the notation
$A \succeq 0$ (resp. $A \succ 0$) stands for $A$ is positive
semidefinite (resp. positive definite). With $g_0 := 1$, the quadratic
module $Q(g_1,\dots,g_m) \subset \mathbb{R}[x]$ generated by
polynomials $g_1,\dots,g_m$ is defined by

\[ Q(g_1,\dots,g_m) := \Big\{ \sum_{j=0}^m \sigma_j g_j : \sigma_j \in \Sigma[x] \Big\}. \]

We briefly recall two important theorems by Putinar [16] and
Krivine-Stengle [9, 19] respectively, on the representation of
polynomials positive on K.

Theorem 2.1 Let $g_0 = 1$ and K in (1.2) be compact.

(a) If the quadratic polynomial $x \mapsto M - \|x\|^2$ belongs to
$Q(g_1,\dots,g_m)$ and if $f \in \mathbb{R}[x]$ is strictly positive on
K, then $f \in Q(g_1,\dots,g_m)$.

(b) Assume that $0 \le g_j \le 1$ on K for every $j$, and the family
$\{g_j, 1 - g_j\}$ generates $\mathbb{R}[x]$. If $f$ is strictly
positive on K then

\[ f = \sum_{\alpha,\beta \in \mathbb{N}^m} c_{\alpha\beta} \Big( \prod_{j=1}^m g_j^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j)^{\beta_j} \Big), \]

for some finitely many nonnegative scalars $(c_{\alpha\beta})$.

^2 An SOS-convex polynomial is such that its Hessian matrix is SOS,
i.e., factors as $L(x)L(x)^T$ for some rectangular matrix polynomial
$L$.

2.2 Main result



With K as in (1.2) we make the following assumption:

Assumption 1 K is compact and $0 \le g_j \le 1$ on K for all
$j = 1,\dots,m$. Moreover, the family of polynomials
$\{g_j, 1 - g_j\}$ generates the algebra $\mathbb{R}[x]$.

Notice that if K is compact and Assumption 1 does not hold, one may
always rescale the variables $x_i$ so as to have
$K \subset [0,1]^n$, and then add the redundant constraints
$0 \le x_i \le 1$ for all $i = 1,\dots,n$. Then the family
$\{g_j, 1 - g_j\}$ (which includes $x_i$ and $1 - x_i$ for all $i$)
generates the algebra $\mathbb{R}[x]$ and Assumption 1 holds.
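The rescaling step can be sketched as follows for a single variable, assuming the variable is known to lie in an interval [lo, hi]. The function name `rescale_to_unit_interval` and the interval bounds are illustrative assumptions, not part of the paper. The affine substitution x = lo + (hi - lo) y maps y in [0, 1] onto x in [lo, hi], and the polynomial data are rewritten in the new variable y.

```python
from math import comb


def rescale_to_unit_interval(coeffs, lo, hi):
    """Coefficients of q(y) = p(lo + (hi - lo) * y), low degree first.

    `coeffs` are the coefficients of p in the variable x, low degree first.
    """
    w = hi - lo
    out = [0] * len(coeffs)
    for j, c in enumerate(coeffs):
        # Expand c * (lo + w * y)^j with the binomial theorem.
        for i in range(j + 1):
            out[i] += c * comb(j, i) * lo ** (j - i) * w ** i
    return out
```

For instance, p(x) = x^2 on [-1, 1] becomes q(y) = (2y - 1)^2 on [0, 1]. In several variables the same substitution would be applied coordinate by coordinate to f and to each g_j.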

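With $d \in \mathbb{N}$ and
$0 \le \lambda = (\lambda_{\alpha\beta})$,
$(\alpha,\beta) \in \mathbb{N}^{2m}_d$, let
$\lambda \mapsto G_d(\lambda)$ be the function defined in (1.6), with
associated problem:

\[ \rho_d = \max_{\lambda} \{ G_d(\lambda) : \lambda \ge 0 \}. \tag{2.1} \]

Observe that $G_d(\lambda) \le f^*$ for all $\lambda \ge 0$, and
computing $\rho_d$ is just solving the Lagrangian relaxation of problem
$P_d$ in (1.7).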

Theorem 2.2 Let K be as in (1.2), $f \in \mathbb{R}[x]$,
$d \in \mathbb{N}$, and let Assumption 1 hold. Consider problem (2.1)
associated with P and with optimal value $\rho_d$. Then the sequence
$(\rho_d)$, $d \in \mathbb{N}$, is monotone nondecreasing and
$\rho_d \to f^*$ as $d \to \infty$.

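Proof. We first prove that $\rho_{d+1} \ge \rho_d$ for all $d$, so that
the sequence $(\rho_d)$, $d \in \mathbb{N}$, is monotone nondecreasing.
Let $0 \le \lambda = (\lambda_{\alpha\beta})$ with
$(\alpha,\beta) \in \mathbb{N}^{2m}_d$. Then
$0 \le \tilde\lambda$ with
$\tilde\lambda_{\alpha\beta} = \lambda_{\alpha\beta}$ whenever
$(\alpha,\beta) \in \mathbb{N}^{2m}_d$, and
$\tilde\lambda_{\alpha\beta} = 0$ whenever $|\alpha + \beta| > d$, is
such that $G_{d+1}(\tilde\lambda) = G_d(\lambda)$, and so
$\rho_{d+1} \ge \rho_d$. Next, let $\epsilon > 0$ be fixed, arbitrary.
The polynomial $f - f^* + \epsilon$ is positive on K and therefore, by
[E], [12, Theorem 2.23],

\[ f - f^* + \epsilon = \sum_{\alpha,\beta \in \mathbb{N}^m} c^\epsilon_{\alpha\beta} \Big( \prod_{j=1}^m g_j^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j)^{\beta_j} \Big), \]

for some nonnegative vector of coefficients
$c^\epsilon = (c^\epsilon_{\alpha\beta})$. Equivalently,

\[ f - \sum_{\alpha,\beta \in \mathbb{N}^m} c^\epsilon_{\alpha\beta} \Big( \prod_{j=1}^m g_j^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j)^{\beta_j} \Big) = f^* - \epsilon. \]

Letting

\[ d_\epsilon := \max_{\alpha,\beta} \{ |\alpha + \beta| : c^\epsilon_{\alpha\beta} > 0 \}, \]

we obtain $f^* \ge G_{d_\epsilon}(c^\epsilon) = f^* - \epsilon$. And so

\[ f^* \ge \max_{\lambda} \{ G_{d_\epsilon}(\lambda) : \lambda \ge 0 \} \ge f^* - \epsilon. \]

As $\epsilon > 0$ was arbitrary, the desired result follows. □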

Corollary 2.1 Let K be as in (1.2), let Assumption 1 hold, and let
$P_d$, $d \in \mathbb{N}$, be as in (1.7). Then for every
$\epsilon > 0$ there exists $d_\epsilon \in \mathbb{N}$ such that for
every $d \ge d_\epsilon$, the Lagrangian relaxation of $P_d$ yields a
lower bound $f^* - \epsilon \le \rho_d \le f^*$.

This follows from Theorem 2.2 and the fact that computing $\rho_d$ is
just solving the Lagrangian relaxation associated with $P_d$. So the
interpretation of Corollary 2.1 is that the Lagrangian relaxation
technique in non convex optimization can provide a lower bound as close
as desired to the global optimum $f^*$, provided that it is applied to
an equivalent formulation of P that contains sufficiently many
redundant constraints which are products of the original ones. It also
provides a rigorous rationale for the well-known fact that adding
redundant constraints helps solve P. Indeed, even though the new
problems $P_d$, $d \in \mathbb{N}$, are all equivalent to P, their
Lagrangian relaxations are not equivalent to that of P.
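A small worked instance may help to see the effect of redundant products. It is an illustrative toy example, not taken from the paper: take $n = 1$, $f(x) = -x^2$ and $K = [0,1]$ described by $g_1(x) = x$, $g_2(x) = 1 - x$, so that every product in (1.7) has the form $x^a (1-x)^b$ and $f^* = -1$ is attained at $x = 1$.

```latex
% Level d = 1: only affine multiplier terms are available, so for every
% \lambda \ge 0 the Lagrangian is a concave quadratic, unbounded below:
L_1(x,\lambda) = -x^2 - \lambda_0 - \lambda_1 x - \lambda_2 (1 - x),
\qquad G_1(\lambda) = \min_x L_1(x,\lambda) = -\infty ,
\qquad \rho_1 = -\infty .

% Level d = 2: the redundant product x(1-x) \ge 0 becomes available, and
-x^2 + 1 = (1 - x) + x(1 - x) .
% Taking multiplier 1 on (1-x) and on x(1-x), and 0 elsewhere, gives
L_2(x,\lambda) = -x^2 - (1 - x) - x(1 - x) = -1 \quad \text{for all } x ,
\qquad G_2(\lambda) = -1 = f^* ,
\qquad \rho_2 = f^* .
```

So in this toy case the plain Lagrangian relaxation of P is useless, while the Lagrangian relaxation of the equivalent problem $P_2$ is already exact.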

2.3 Convex programs 

In this section, the set K is not assumed to be compact. 

Theorem 2.3 Let K be as in (1.2) and assume that $f$ and $-g_j$ are
convex, $j = 1,\dots,m$. Moreover, assume that Slater's condition^3
holds and $f^* > -\infty$.

Then the hierarchy of convex relaxations (1.5) has finite convergence
at step $d = 1$, i.e., $\rho_1 = f^*$, and
$\rho_1 = G_1(\lambda^*)$ for some nonnegative
$\lambda^* \in \mathbb{R}^m$.

^3 Slater's condition holds for P if there exists $x_0 \in K$ such that
$g_j(x_0) > 0$ for every $j = 1,\dots,m$.

Proof. This is because the dual method applied to P (i.e. $P_1$)
converges, i.e.,

\[ f^* = \max_{\lambda \ge 0} \Big\{ \min_x \Big\{ f(x) - \sum_{j=1}^m \lambda_j g_j(x) \Big\} \Big\} = \max_{\lambda} \{ G_1(\lambda) : \lambda \ge 0 \} = \rho_1 . \]

Next, let $(\lambda^{(n)})$ be a maximizing sequence, i.e.,
$G_1(\lambda^{(n)}) \to f^*$ as $n \to \infty$. Since Slater's
condition holds (say at some $x_0 \in K$), one has

\[ G_1(\lambda^{(0)}) \le G_1(\lambda^{(n)}) \le f(x_0) - \sum_{j=1}^m \lambda^{(n)}_j g_j(x_0), \]

for all $n$, and so
$\lambda^{(n)}_j \le (f(x_0) - G_1(\lambda^{(0)})) / g_j(x_0)$ for
every $j = 1,\dots,m$ and all $n \ge 1$. So there is a subsequence
$(n_k)$, $k \in \mathbb{N}$, and $\lambda^* \in \mathbb{R}^m_+$, such
that $\lambda^{(n_k)} \to \lambda^* \ge 0$ as $k \to \infty$. Finally,
let $x \in \mathbb{R}^n$ be fixed, arbitrary. From

\[ G_1(\lambda^{(n_k)}) \le f(x) - \sum_{j=1}^m \lambda^{(n_k)}_j g_j(x), \quad \forall k, \]

letting $k \to \infty$ yields

\[ f^* \le f(x) - \sum_{j=1}^m \lambda^*_j g_j(x). \]

As $x \in \mathbb{R}^n$ was arbitrary, this proves
$G_1(\lambda^*) \ge f^*$, which combined with
$G_1(\lambda^*) \le f^*$ yields the desired result
$G_1(\lambda^*) = f^*$. □

Observe that this does not hold for the LP-relaxations (1.4), where
generically $\theta_d < f^*$ for every $d \in \mathbb{N}$; see e.g.
[10, 12].
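The lack of finite convergence of the LP bounds can be observed on a one-dimensional toy problem. The sketch below is an illustration under stated assumptions, not the paper's computation: it takes n = 1, K = [0, 1] with g(x) = x, and the convex objective f(x) = (x - 1/2)^2 with f* = 0. In this special case every product x^a (1-x)^b with a + b <= d is a nonnegative combination of the degree-d Bernstein polynomials (multiply by (x + (1-x))^(d-a-b)), so the LP value of (1.4) equals the smallest degree-d Bernstein coefficient of f. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernstein_coeffs(a, d):
    """Degree-d Bernstein coefficients b_i of p(x) = sum_j a[j] x^j on [0, 1].

    They satisfy p(x) = sum_i b_i * C(d, i) * x^i * (1 - x)^(d - i),
    and require d >= deg p.
    """
    deg = len(a) - 1
    return [
        sum(Fraction(comb(i, j), comb(d, j)) * a[j] for j in range(min(i, deg) + 1))
        for i in range(d + 1)
    ]


def lp_bound(a, d):
    """LP lower bound for min of p on [0, 1] in the toy setting g(x) = x."""
    return min(bernstein_coeffs(a, d))


# f(x) = (x - 1/2)^2 = 1/4 - x + x^2, whose minimum on [0, 1] is f* = 0.
f = [Fraction(1, 4), -1, 1]
bounds = {d: lp_bound(f, d) for d in range(2, 9)}
```

In this toy case the bound stays strictly negative at every level d and approaches f* = 0 only at rate roughly 1/d, whereas the Lagrangian relaxation at d = 1 is already exact (with multiplier zero) because f is convex.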



3 A parametrized hierarchy of 
semidefinite relaxations 

Problem (2.1) is convex, but in general the objective function $G_d$ is
non differentiable. Moreover, another difficulty is the computation of
$G_d(\lambda)$ for each $\lambda \ge 0$, since $G_d(\lambda)$ is the
global optimum of the possibly non convex function
$x \mapsto L_d(x,\lambda)$ defined in (1.8). So one strategy is to
replace (2.1) by a simpler convex problem (while preserving the
convergence property) as follows.

3.1 Interpreting the LP-relaxations

Observe that the LP-relaxations (1.4) can be written

\[ \theta_d = \max_{\lambda \ge 0,\, t} \{ t : L_d(x,\lambda) - t = 0,\ \forall x \in \mathbb{R}^n \}, \tag{3.1} \]

where $L_d$ has been defined in (1.8).

And so the LP-relaxations (1.4) can be interpreted as simplifying (2.1)
by restricting the nonnegative orthant
$\{\lambda : \lambda \ge 0\}$ to its subset of $\lambda$'s that make
the polynomial $x \mapsto L_d(x,\lambda) - t$ constant and equal to
zero, instead of being only nonnegative. This subset being a
polyhedron, solving (3.1) reduces to solving a linear program. At first
glance, such an a priori simple and naive brute force simplification
might seem unreasonable (to say the least). But of course the
LP-relaxations (1.4) were not defined this way. Initially, the
Sherali-Adams' RLT hierarchy [17] was introduced for 0/1 programs and
finite convergence was proved by using ad hoc arguments. But in fact,
the rationale behind convergence of the more general LP-relaxations
(1.4) is the Krivine-Stengle positivity certificate
[12, Theorem 2.23].

However, even though this brute force simplification still preserves
the convergence $\theta_d \to f^*$ thanks to [12, Theorem 2.23], we
have already mentioned that it also implies serious theoretical (and
practical) drawbacks for the resulting LP-relaxations (like slow
asymptotic convergence for convex problems and numerical
ill-conditioning).

3.2 A parametrized hierarchy of semidefinite relaxations 

However, inspired by this interpretation, we propose a systematic way
to improve the LP-relaxations (1.4) along the same lines but by doing a
much less brutal simplification of (2.1). Indeed, one may now impose on
the same nonnegative polynomial $x \mapsto L_d(x,\lambda) - t$ to be a
sum of squares (SOS) polynomial $\sigma$ of degree at most $2k$
(instead of being constant and equal to zero as in (3.1)), and solve
the resulting hierarchy of optimization problems:

\[ q^k_d = \max_{\lambda, t, \sigma} \{ t : L_d(x,\lambda) - t = \sigma(x),\ \forall x \in \mathbb{R}^n;\ \lambda \ge 0,\ \sigma \in \Sigma[x]_k \}, \tag{3.2} \]

with $d = 1, 2, \dots$, and parametrized by $k$, fixed. (Recall that
$\Sigma[x]_k$ denotes the set of SOS polynomials of degree at most
$2k$.) To see that (3.2) is a semidefinite program, write

\[ x \mapsto L_d(x,\lambda) - t := \sum_{\beta \in \mathbb{N}^n_s} L_\beta(\lambda,t)\, x^\beta, \]

where $s = d \max_j [\deg g_j]$ and $L_\beta(\lambda,t)$ is linear in
$(\lambda,t)$ for each $\beta \in \mathbb{N}^n_s$.

Next, for $k \in \mathbb{N}$ such that $2k \le s$, let $v_k(x)$ be the
vector of the monomial basis $(x^\beta)$,
$\beta \in \mathbb{N}^n_k$, of $\mathbb{R}[x]_k$, and write

\[ v_k(x) v_k(x)^T = \sum_{\beta \in \mathbb{N}^n_{2k}} x^\beta B_\beta, \]

for some appropriate real symmetric matrices $(B_\beta)$,
$\beta \in \mathbb{N}^n_{2k}$. Then problem (3.2) is the semidefinite
program:

\[
\begin{array}{rl}
q^k_d = \max_{\lambda, t, Q} & t \\
\text{s.t.} & L_\beta(\lambda,t) = \langle B_\beta, Q \rangle, \quad \forall \beta \in \mathbb{N}^n_{2k}, \\
 & L_\beta(\lambda,t) = 0, \quad \forall \beta \in \mathbb{N}^n_s,\ |\beta| > 2k, \\
 & \lambda \ge 0;\ Q = Q^T \succeq 0,
\end{array} \tag{3.3}
\]

where $Q$ is a $\binom{n+k}{n} \times \binom{n+k}{n}$ real symmetric
matrix.
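The matrices $B_\beta$ are simple 0/1 matrices, and the following minimal sketch shows one way to build them. The function names `monomials` and `moment_matrices` and the ordering of the monomial basis are illustrative choices, not taken from the paper. The entry (r, c) of v_k(x) v_k(x)^T is the monomial whose exponent is the sum of the r-th and c-th basis exponents, so B_beta has a 1 exactly at the positions whose exponents add up to beta.

```python
from itertools import product


def monomials(n, k):
    """Exponent tuples beta in N^n with |beta| <= k, in a fixed order."""
    return [b for b in product(range(k + 1), repeat=n) if sum(b) <= k]


def moment_matrices(n, k):
    """Return {beta: B_beta} such that v_k(x) v_k(x)^T = sum_beta x^beta B_beta.

    Each B_beta is a symmetric 0/1 matrix stored as a list of rows.
    """
    basis = monomials(n, k)
    size = len(basis)
    B = {}
    for r, br in enumerate(basis):
        for c, bc in enumerate(basis):
            beta = tuple(i + j for i, j in zip(br, bc))
            if beta not in B:
                B[beta] = [[0] * size for _ in range(size)]
            B[beta][r][c] = 1
    return B
```

For n = 1 and k = 1 the basis is (1, x) and the three matrices correspond to the monomials 1, x and x^2. In (3.3), each constraint then equates a linear form in (lambda, t) with the sum of the entries of Q selected by one B_beta.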

Of course $q^k_d \ge \theta_d$ ($= q^0_d$) for all $d$, because with
$\sigma = 0$ one retrieves (1.4). Moreover, in the semidefinite program
(3.3), the semidefinite constraint $Q \succeq 0$ is concerned with a
real symmetric $\binom{n+k}{n} \times \binom{n+k}{n}$ matrix,
independently of the rank $d$ in the hierarchy. For instance, if
$k = 1$ then $\sigma$ is a quadratic SOS and $Q$ has size
$(n + 1) \times (n + 1)$. In other words, even if the number of
variables $\lambda = (\lambda_{\alpha\beta})$ increases fast with $d$,
the LMI constraint $Q \succeq 0$ has fixed size, in contrast with the
semidefinite relaxations (1.3), where the size of the LMIs increases
with $d$. And it is a well-known fact that what is crucial for solving
semidefinite programs is the size of the LMIs involved rather than the
number of variables.
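The size comparison can be tabulated with a few binomial coefficients. The following is a small sketch with hypothetical function names; the count for the standard hierarchy assumes the usual Gram-matrix size for an SOS of degree at most 2d in n variables.

```python
from math import comb


def lmi_size_parametrized(n, k):
    """Side length of Q in (3.3): number of monomials of degree <= k."""
    return comb(n + k, n)


def lmi_size_standard(n, d):
    """Side length of the largest Gram matrix at level d of hierarchy (1.3)."""
    return comb(n + d, n)


def num_multipliers(m, d):
    """Number of pairs (alpha, beta) in N^{2m} with |alpha| + |beta| <= d."""
    return comb(2 * m + d, d)
```

With n = 10 variables and k = 1, the LMI in (3.3) stays at size 11 for every d, while the number of scalar multipliers and the LMI size of (1.3) both grow with d.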



3.3 Sherali-Adams' RLT for 0/1 programs 

Consider 0/1 programs with $f \in \mathbb{R}[x]$ and feasible set
$K = \{x : Ax \le b\} \cap \{0,1\}^n$, for some real matrix
$A \in \mathbb{R}^{m \times n}$ and some vector
$b \in \mathbb{R}^m$. The Sherali-Adams' RLT hierarchy [17] belongs to
the family of LP-relaxations (1.4), but with a more specific form since
$K \subset [0,1]^n$. Notice that the family
$\{1, x_1, (1 - x_1), \dots, x_n, (1 - x_n)\}$ generates the algebra
$\mathbb{R}[x]$. Let $g_\ell(x) = (b - Ax)_\ell$,
$\ell = 1,\dots,m$, and $g_0(x) = 1$.

Following the definition of the Sherali-Adams' RLT in [17], the
resulting linear program at step $d$ in the hierarchy reads:

\[
\theta_d = \max_{t,\, \lambda \ge 0,\, h} \Big\{ t : f(x) - t = \sum_{i=1}^n h_i(x)\, x_i (1 - x_i)
 + \sum_{\ell=0}^m \sum_{\substack{I, J \subseteq \{1,\dots,n\} \\ I \cap J = \emptyset;\ |I \cup J| \le d}} \lambda^\ell_{IJ}\, g_\ell(x) \prod_{i \in I} x_i \prod_{j \in J} (1 - x_j),
 \quad h_i \in \mathbb{R}[x]_{d-1},\ i = 1,\dots,n \Big\}, \tag{3.4}
\]

where $\lambda$ is the nonnegative vector $(\lambda^\ell_{IJ})$. (If
there are linear equality constraints $g_\ell(x) = 0$, the
corresponding variables $\lambda^\ell_{IJ}$ are not required to be
nonnegative.) So all products between the $g_\ell$'s are ignored (see
the paragraph before Lemma 1 in [17, p. 414]) even though they might
help tighten the relaxations. In the literature the dual LP of (3.4) is
described rather than (3.4) itself.

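In this context, the problem equivalent to P and defined in (1.7) by
adding redundant constraints formed with products of original ones,
reads:

\[
\min \Big\{ f(x) : x^\alpha x_j (1 - x_j) = 0,\ j = 1,\dots,n;\ \alpha \in \mathbb{N}^n_{d-1};
\quad g_\ell(x) \prod_{i \in I} x_i \prod_{j \in J} (1 - x_j) \ge 0,\ \ell = 0,\dots,m,
\quad I, J \subseteq \{1,\dots,n\};\ I \cap J = \emptyset;\ |I \cup J| \le d \Big\}.
\]

Hence the 0/1 analogue of (3.2) reads

\[
q^k_d = \max_{t,\, \lambda \ge 0,\, \sigma,\, h} \Big\{ t : f(x) - t = \sigma(x) + \sum_{i=1}^n h_i(x)\, x_i (1 - x_i)
 + \sum_{\ell=0}^m \sum_{\substack{I, J \subseteq \{1,\dots,n\} \\ I \cap J = \emptyset;\ |I \cup J| \le d}} \lambda^\ell_{IJ}\, g_\ell(x) \prod_{i \in I} x_i \prod_{j \in J} (1 - x_j),
 \quad \sigma \in \Sigma[x]_k;\ h_i \in \mathbb{R}[x]_{d-1},\ i = 1,\dots,n \Big\}. \tag{3.5}
\]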

For 0/1 programs with linear or quadratic objective function, and for
every $k \ge 1$, the first semidefinite relaxation (3.5), i.e., with
d = 2, is at least as powerful as that of the standard hierarchy of
semidefinite relaxations (1.3). Indeed (3.5) contains products
$g_\ell(x) x_j$ or $g_\ell(x)(1 - x_k)$, for all $(\ell, j, k)$, which
do not appear in (1.3) with $d = 1$. And so in particular, the first
such relaxation for MAXCUT has the celebrated Goemans-Williamson
performance guarantee, while the standard LP-relaxations (1.4) do not.
On the other hand, for 0/1 problems and for the parameter value
$k = 1$, the hierarchy (3.5) is what is called the Sherali-Adams + SDP
hierarchy (basic SDP-relaxation + RLT hierarchy) in e.g. Benabbas and
Magen [3] and Benabbas et al. [2]; and in [3, 2] the authors show that
any (constant) level $d$ of this hierarchy, viewed as a strengthening
of the basic SDP-relaxation, does not make the integrality gap
decrease.

In fact, and in view of our previous analysis, the "Sherali-Adams +
SDP" hierarchy should be viewed as a (level $k = 1$)-strengthening of
the basic Sherali-Adams' LP-hierarchy (3.4) rather than a strengthening
of the basic SDP relaxation.

4 Comparing with standard 
LP-relaxations 

As asked in the introduction: what do we gain by going from the LP
hierarchy to the semidefinite hierarchy (3.3) parametrized by $k$? Some
answers are provided below.

4.1 Convex problems 

Recall that a highly desirable property for a general purpose method
aiming at solving NP-hard optimization problems is to behave
efficiently when applied to a class of problems considered relatively
easy to solve. Otherwise one might raise reasonable doubts on its
efficiency for more difficult problems, not only in a worst-case sense
but also on average. And convex problems P as in (1.1)-(1.2), i.e.,
when $f$, $-g_j$ are convex, form the most natural class of problems
which are considered easy to solve by some standard methods of Non
Linear Programming.

Theorem 4.1 With P as in (1.1)-(1.2), let $f$, $-g_j$ be convex,
$j = 1,\dots,m$, let Slater's condition hold and let
$f^* > -\infty$. Then:

(a) If $\max[\deg f, \deg g_j] \le 2$ then $q^1_1 = f^*$, i.e., the
first relaxation of the hierarchy (3.2) parametrized by $k = 1$ is
exact.

(b) If $\max[\deg f, \deg g_j] \le 2k$ and $f$, $-g_j$ are all
SOS-convex, then $q^k_1 = f^*$, i.e., the first relaxation of the
hierarchy (3.2) parametrized by $k$ is exact.

Proof. Under the assumptions of Theorem 4.1, P has a minimizer
$x^* \in K$ and the Karush-Kuhn-Tucker optimality conditions hold at
$(x^*, \lambda^*) \in K \times \mathbb{R}^m_+$ for some
$\lambda^* \in \mathbb{R}^m_+$. And so if $k = 1$, the Lagrangian
polynomial $L_1(\cdot,\lambda^*) - f^*$ is a nonnegative quadratic
polynomial and so an SOS $\sigma^* \in \Sigma[x]_1$. Therefore, as
$q^1_d \le f^*$ for all $d$, the triplet
$(\lambda^*, f^*, \sigma^*)$ is an optimal solution of (3.2) with
$k = d = 1$, which proves (a).

Next, if $k > 1$ and $f$, $-g_j$ are all SOS-convex, then so is the
Lagrangian polynomial $L_1(\cdot,\lambda^*) - f^*$. In addition, as
$\nabla_x L_1(x^*,\lambda^*) = 0$ and
$L_1(x^*,\lambda^*) - f^* = 0$, the polynomial
$L_1(\cdot,\lambda^*) - f^*$ is SOS; see e.g. Helton and Nie
[8, Lemma 4.2]. Hence $L_1(\cdot,\lambda^*) - f^* = \sigma^*$ for some
$\sigma^* \in \Sigma[x]_k$, and again, the triplet
$(\lambda^*, f^*, \sigma^*)$ is an optimal solution of (3.2) with
$d = 1$, which proves (b). □
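The argument in part (a) can be followed on a toy instance, chosen here for illustration and not taken from the paper: $n = 1$, $f(x) = (x+1)^2$, and $K = [0,1]$ described by $g_1(x) = x$, $g_2(x) = 1 - x$. Slater's condition holds at $x_0 = 1/2$, the minimizer is $x^* = 0$ and $f^* = 1$.

```latex
% KKT at x^* = 0: only g_1 is active, so \lambda_2^* = 0 and
f'(x^*) = \lambda_1^* g_1'(x^*) \;\Longrightarrow\; \lambda_1^* = 2 .

% Lagrangian polynomial at level d = 1 with these multipliers:
L_1(x,\lambda^*) - f^* = (x+1)^2 - 2x - 1 = x^2 =: \sigma^*(x) \in \Sigma[x]_1 .

% Hence (\lambda^*, f^*, \sigma^*) is feasible for (3.2) with k = d = 1 and
% attains the value t = f^*, so that q^1_1 = f^* = 1 .
```

By contrast, at level $k = 0$ the same polynomial $x^2$ would have to vanish identically, which is why the LP level cannot certify the optimum with these multipliers.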

Hence by simplifying (1.5) in a less brutal manner than in (1.4), one
recovers a nice and highly desirable property for the resulting
hierarchy. The price to pay is to pass from solving a hierarchy of LPs
to solving a hierarchy of semidefinite programs; however, the increase
in complexity is controlled by the parameter $k$, since the size of the
LMI in the semidefinite program (3.3) is $O(n^k)$, independently of the
rank $d$ in the hierarchy.

4.2 Obstructions to Exactness 

On the other hand, for non convex problems, exactness at level $d$ of
the hierarchy (3.2), i.e., finite convergence after $d$ rounds, still
implies restrictive conditions on the problem:

Corollary 4.1 Let P be as in (1.1)-(1.2) and let Assumption 1 hold. Let
$x^* \in K$ be a global minimizer and let
$I_1(x^*) := \{ j \in \{1,\dots,m\} : g_j(x^*) = 0 \}$ and
$I_2(x^*) := \{ j \in \{1,\dots,m\} : (1 - g_j(x^*)) = 0 \}$ be the
sets of active constraints at $x^*$. Let $0 \le k \in \mathbb{N}$ be
fixed.

The level-$d$ semidefinite relaxation (3.2) is exact only if $f^*$
(resp. $x^* \in K$) is also the global optimum (resp. a global
minimizer) for the problem

\[ \min_x \{ f(x) : x \in V \}, \tag{4.1} \]

where $V \subset \mathbb{R}^n$ (see (4.2) below) is a variety defined
from some products of the polynomials $g_j$'s and $(1 - g_j)$'s. And if
$k = 0$ then $f$ must be constant on the variety $V$.

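Proof. If (3.2) is exact at level $d \in \mathbb{N}$, then

\[ L_d(x,\lambda^*) - f^* = \sigma(x), \quad \forall x \in \mathbb{R}^n, \]

for some $\lambda^* \ge 0$ and some $\sigma \in \Sigma[x]_k$.
Equivalently,

\[ f(x) - f^* = \sigma(x) + \sum_{(\alpha,\beta) \in \mathbb{N}^{2m}_d} \lambda^*_{\alpha\beta} \Big( \prod_{j=1}^m g_j(x)^{\alpha_j} \Big) \Big( \prod_{j=1}^m (1 - g_j(x))^{\beta_j} \Big). \]

Then evaluating at $x = x^*$ yields $\sigma(x^*) = 0$ and

\[ \lambda^*_{\alpha\beta} > 0 \;\Longrightarrow\; \exists j \in I_1(x^*) \text{ s.t. } \alpha_j > 0, \ \text{or}\ \exists j \in I_2(x^*) \text{ s.t. } \beta_j > 0. \]

So let
$\Omega := \{ (\alpha,\beta) \in \mathbb{N}^{2m}_d : \lambda^*_{\alpha\beta} > 0 \}$
and for every $(\alpha,\beta) \in \Omega$ let

\[ I_1(x^*;\alpha) := \{ j \in I_1(x^*) : \alpha_j > 0 \}, \qquad I_2(x^*;\beta) := \{ j \in I_2(x^*) : \beta_j > 0 \}. \]

Next, define $V \subset \mathbb{R}^n$ to be the real variety:

\[ V := \Big\{ x \in \mathbb{R}^n : \Big( \prod_{j \in I_1(x^*;\alpha)} g_j(x) \Big) \Big( \prod_{j \in I_2(x^*;\beta)} (1 - g_j(x)) \Big) = 0, \ \forall (\alpha,\beta) \in \Omega \Big\}. \tag{4.2} \]

Then for every $x \in V$, one obtains
$f(x) - f^* = \sigma(x) \ge 0$, which means that $f^*$ is the global
minimum of $f$ on $V$. If $k = 0$ then $\sigma$ is constant and equal
to zero. And so $f(x) - f^* = 0$ for all $x \in V$. □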

Hence Corollary 4.1 shows that exactness at some step $d$ of the
hierarchy (3.2) imposes rather restrictive conditions on problem P.
Namely, the global optimum $f^*$ (resp. the global minimizer
$x^* \in K$) must also be the global optimum (resp. a global minimizer)
of problem (4.1). For instance, suppose that only one constraint, say
$g_k(x) \ge 0$, is active at $x^*$. Then $f^*$ (resp. $x^*$) is also
the global minimum (resp. a global minimizer) of $f$ on the variety
$\{x : g_k(x) = 0\}$. And if $k = 0$ then $f$ must be constant on the
variety $V$!

Example 1 If K is the (compact) polytope
$\{x : a_j^T x \le 1,\ j = 1,\dots,m\}$ for some vectors
$(a_j) \subset \mathbb{R}^n$, then invoking a result by Handelman [7],
one does not need the polynomials $\{1 - g_j\}$ in the definition (1.8)
of $L_d$. So for instance, suppose that $I_1(x^*) = \{\ell\}$ at a
global minimizer $x^* \in K$. Then exactness at some step $d$ of the
hierarchy (3.2) imposes that $f^*$ should also be the global minimum of
$f$ on the whole hyperplane $V = \{x : a_\ell^T x = 1\}$; for non
convex functions $f$, this is a serious restriction. Moreover, if
$k = 0$ then $f$ must be constant on the hyperplane $V$.

Concerning exactness for 0/1 polynomial optimization: 

Corollary 4.2 Let $K = \{x : Ax \leq b\} \cap \{0,1\}^n$ and let $x^* \in K$ be an
optimal solution of $f^* = \min\{f(x) : x \in K\}$. Assume that $Ax^* < b$, i.e.,
no constraint is active at $x^*$.

(a) The Sherali-Adams' RLT relaxation is exact at step $d$ in the
hierarchy only if $f(x) = f(x^*) = f^*$ for all $x$ in the set

\[ V := \left\{ x \in \{0,1\}^n : \prod_{i \in I} x_i \prod_{j \in J} (1 - x_j) = 0, \quad (I,J) \in Q \right\}, \]

where $Q$ is some finite set of couples $(I,J)$ satisfying $I \cap J = \emptyset$ and $|I \cup J| \leq d$.

(b) Similarly, the semidefinite relaxation (3.5) is exact at step $d$ only if
$x^*$ is also a global minimizer of $\min\{f(x) : x \in V\}$ for some $V$ as in (a).

Proof. (a) Exactness implies that the polynomial $x \mapsto f(x) - f^*$ has the
representation described in (3.4) for some polynomials $(h_i) \subset \mathbb{R}[x]_{d-2}$ and
some nonnegative scalars $(\lambda^\ell_{IJ})$. Evaluating both sides of (3.4) at $x = x^*$
and using $g_\ell(x^*) > 0$ for all $\ell = 0, \ldots, m$, yields

\[ \lambda^\ell_{IJ} > 0 \implies \prod_{i \in I} x^*_i \prod_{j \in J} (1 - x^*_j) = 0. \tag{4.3} \]

Let $V$ be as in Corollary 4.2 with $Q := \{(I,J) : \exists \ell \text{ s.t. } \lambda^\ell_{IJ} > 0\}$. Then
from the representation of $x \mapsto f(x) - f^*$ in (3.4) we obtain $f(x) - f^* = 0$
for all $x \in V$ and the result follows. For (b) a similar argument is valid
but now using the representation of $f(x) - f^*$ described in (3.5). And so
exactness yields (4.3) as well as $\sigma(x^*) = 0$. Next, for every $x \in V$ we now
obtain $f(x) - f^* = \sigma(x) \geq 0$ because $\sigma$ is SOS. □

The constraints $Ax \leq b$ play no explicit role in the definition of the set
$V$. Moreover, if $f$ discriminates all points of the hypercube $\{0,1\}^n$ then
exactness of the Sherali-Adams' RLT implies that $V$ must be the singleton
$\{x^*\}$.
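
For small $n$ the set $V$ of Corollary 4.2(a) can be enumerated directly, which makes the necessary condition easy to check by hand. The Python sketch below does this under assumptions that are not taken from the text: $n = 3$, a hypothetical set $Q$ made of two couples $(I,J)$ (with 0-based indices), and a hypothetical objective $f$ that takes a distinct value at each vertex of the hypercube. The helper names `variety_points` and `is_constant_on` are illustrative only.

```python
from itertools import product


def variety_points(n, Q):
    """Return all x in {0,1}^n such that, for every couple (I, J) in Q,
    prod_{i in I} x_i * prod_{j in J} (1 - x_j) == 0."""
    points = []
    for x in product((0, 1), repeat=n):
        in_variety = True
        for I, J in Q:
            term = 1
            for i in I:
                term *= x[i]
            for j in J:
                term *= 1 - x[j]
            if term != 0:
                in_variety = False
                break
        if in_variety:
            points.append(x)
    return points


def is_constant_on(f, points):
    """True if f takes at most one value on the given points."""
    return len({f(x) for x in points}) <= 1


def f(x):
    # Hypothetical objective: binary encoding, so f separates all vertices.
    return x[0] + 2 * x[1] + 4 * x[2]


# Hypothetical Q (0-based indices): x_0 * (1 - x_1) = 0 and x_2 = 0.
Q = [((0,), (1,)), ((2,), ())]
V = variety_points(3, Q)
print(V)                     # the points of V
print(is_constant_on(f, V))  # whether the necessary condition can hold
```

With this particular $Q$ the set $V$ contains three points on which $f$ takes three different values, so the condition in (a) fails for this $Q$. Since this $f$ separates all vertices, the condition can only hold when $V$ reduces to a single point.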

On the hierarchy of semidefinite relaxations 

Similarly, the hierarchy of semidefinite relaxations (1.3) also has an
interpretation in terms of simplifying an extended Lagrangian relaxation of P.
Indeed consider the hierarchy of optimization problems

\[ \omega_d := \max \left\{ H(\sigma_1, \ldots, \sigma_m) : \deg(\sigma_j g_j) \leq 2d, \ \sigma_j \in \Sigma[x], \ j = 1, \ldots, m \right\}, \tag{4.4} \]

$d \in \mathbb{N}$, where $\sigma \mapsto H(\sigma_1, \ldots, \sigma_m)$ is the function

\[ H(\sigma_1, \ldots, \sigma_m) := \min_x \left\{ f(x) - \sum_{j=1}^m \sigma_j(x) g_j(x) \right\}. \]






For each $d \in \mathbb{N}$, problem (4.4) is an obvious relaxation of P and in fact is an
extended Lagrangian relaxation of P where the multipliers are now allowed
to be SOS polynomials with a degree bound, instead of constant nonnegative
polynomials (i.e., SOS polynomials of degree zero).
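
As a minimal illustration of the degree-zero case, consider a univariate toy problem that is an assumption of this sketch and not an example from the text: $f(x) = x$ and a single constraint $g(x) = 1 - x^2 \geq 0$, so that $K = [-1, 1]$ and $f^* = -1$. For a constant multiplier $\lambda > 0$ the Lagrangian $x \mapsto x - \lambda(1 - x^2)$ is a convex quadratic with minimizer $x = -1/(2\lambda)$, hence $H(\lambda) = -1/(4\lambda) - \lambda$. The Python sketch below evaluates this closed form on a few hypothetical multiplier values and checks the lower-bound property $H(\lambda) \leq f^*$.

```python
F_STAR = -1.0  # global minimum of f(x) = x on K = [-1, 1] (toy problem)


def lagrangian_bound(lam):
    """H(lam) = min over x in R of x - lam * (1 - x**2), for lam > 0.

    The minimizer is x = -1 / (2 * lam), which gives the closed form
    H(lam) = -1 / (4 * lam) - lam.
    """
    if lam <= 0:
        raise ValueError("lam must be positive for the minimum to be finite")
    return -1.0 / (4.0 * lam) - lam


# Hypothetical multiplier values, chosen only for illustration.
MULTIPLIERS = (0.1, 0.25, 0.5, 1.0, 2.0)
bounds = {lam: lagrangian_bound(lam) for lam in MULTIPLIERS}

# Every value H(lam) is a lower bound on f*.
assert all(value <= F_STAR for value in bounds.values())

# Best multiplier among the candidates.
best_lam = max(bounds, key=bounds.get)
print(best_lam, bounds[best_lam])
```

On this convex toy problem the constant multiplier $\lambda = 1/2$ already closes the gap; in general, reaching $f^*$ may require the higher-degree SOS multipliers allowed in (4.4).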
If $K$ is compact and the quadratic module

\[ Q(g) := \left\{ \sum_{j=0}^m \sigma_j g_j : \sigma_j \in \Sigma[x], \ j = 0, 1, \ldots, m \right\} \]

(where $g_0 = 1$) is Archimedean, then $\omega_d \to f^*$ as $d \to \infty$. But of course,
and like for the usual Lagrangian, minimizing the extended Lagrangian

\[ x \mapsto L(x, \sigma) := f(x) - \sum_{j=1}^m \sigma_j(x) g_j(x), \]

is in general an NP-hard problem. In fact, writing (4.4) as

\[ \omega_d = \max \left\{ t : f(x) - \sum_{j=1}^m \sigma_j(x) g_j(x) - t \geq 0 \ \ \forall x; \ \deg(\sigma_j g_j) \leq 2d \right\}, \]

the semidefinite relaxations (1.3) simplify (4.4) by imposing on the nonnegative
polynomial $x \mapsto f(x) - \sum_{j=1}^m \sigma_j(x) g_j(x) - t$ to be an SOS polynomial
$\sigma_0 \in \Sigma[x]_d$ (rather than just being nonnegative).

But the spirit is different from the LP-relaxations as there is no problem
$P_d$ obtained from P by adding finitely many redundant constraints and
equivalent to P. Instead of adding more and more redundant constraints
and doing a standard Lagrangian relaxation to $P_d$, one applies an extended
Lagrangian relaxation to P with SOS multipliers of increasing degree (instead
of nonnegative scalars). And in contrast to LP-relaxations, there is
no obstruction to exactness (i.e., finite convergence). In fact, it is quite the
opposite since as demonstrated recently in Nie [15], finite convergence is
generic!

5 Conclusion 

We have shown that the hierarchy of LP-relaxations (1.4) has a rather
surprising interpretation in terms of the Lagrangian relaxation applied to a
problem equivalent to P (but with redundant constraints formed with
products of the polynomials defining the original constraints of P). Indeed it
consists of the brute force simplification of imposing on a certain nonnegative
polynomial to be the constant polynomial equal to zero, a very restrictive
condition.

However, inspired by this interpretation, one has provided a systematic
strategy to improve the LP-hierarchy by doing a much less brutal simplification.
That is, one now imposes on the same nonnegative polynomial to be
an SOS polynomial whose degree $k$ is fixed in advance and parametrizes the
whole hierarchy. Each convex relaxation is now a semidefinite program but
whose LMI constraint has fixed size $O(n^k)$. Hence, the resulting families
of parametrized relaxations achieve a compromise between the hierarchy of
semidefinite relaxations (1.3) limited to problems of modest size and the
LP-relaxations (1.4) that theoretically can handle problems of larger size
but with a poor behavior when applied to convex problems.